منابع مشابه
Clustering From Categorical Data Sequences
The three-parameter cluster model is a combinatorial stochastic process that generates categorical response sequences by randomly perturbing a fixed clustering parameter. This clear relationship between the observed data and the underlying clustering is particularly attractive in cluster analysis, in which supervised learning is a common goal and missing data is a familiar issue. The model is w...
متن کاملClustering Sequences of Categorical Values
Conceptual clustering is a discovery process that groups a set of data in the way that the intra-cluster similarity is maximized and the inter-cluster similarity is minimized. Traditional clustering algorithms employ some measure of distance between data points in n-dimensional space. However, not all data types can be represented in a metric space, therefore no natural distance function is ava...
متن کاملClustering Categorical Data
Dynamical systems approach for clustering categorical data have been studied by some authors [1]. However, the proposed dynamic algorithm cannot guarantee convergence, so that the execution may get into an in nite loop even for very simple data. We de ne a new conguration updating algorithm for clustering categorical data sets. Let us consider a relational table with k elds, each of which can a...
متن کاملClustering categorical data streams
The data stream model has been defined for new classes of applications involving massive data being generated at a fast pace. Web click stream analysis and detection of network intrusions are two examples. Cluster analysis on data streams becomes more difficult, because the data objects in a data stream must be accessed in order and can be read only once or few times with limited resources. Rec...
متن کاملClustering Data with Categorical Relationships
data aims at considering numeric data, categorical data or a mixture of both. For such data, the concentration was finding a relationship between the data points to be clustered. The relationships were limited to being either binary or fuzzy. Both involved a numeric value called distance or any other similarity measure between two data points and cluster them together if found similar. With tim...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of the American Statistical Association
سال: 2015
ISSN: 0162-1459,1537-274X
DOI: 10.1080/01621459.2014.983521